[TF] Reimplement unbroadcast using on-host axis calculation for performance. by rxwei · Pull Request #24907 · swiftlang/swift

rxwei · 2019-05-20T01:14:17Z

The inefficiency of unbroadcast(toShape:), unbroadcast(to:), and unbroadcast(like:) has caused significant performance problems during model training because these methods perform many TensorFlow operations just to calculate axes. We were forced to implement them this way in the early GPE era, when neither send/receive nor per-op dispatch was available.

This PR reimplements the unbroadcast operations using host-side logic to compute the axes to reduce along. This significantly reduces TensorFlow operation dispatch overhead. The base implementation changed from unbroadcast(toShape:) to unbroadcast(to:).
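As a rough illustration of what "host-side axis calculation" can look like, here is a minimal sketch that works on plain `[Int]` shapes. It is not the code from this PR: the function name `reductionAxes(unbroadcasting:to:)` and its labels are hypothetical, and it assumes the usual right-aligned broadcasting rules, where a dimension is broadcast only if it was missing or equal to 1 in the original shape.

```swift
// Hypothetical helper, not the PR's implementation: computes on the host
// which axes of a broadcasted shape must be reduced to recover `target`.
func reductionAxes(unbroadcasting shape: [Int], to target: [Int]) -> [Int] {
    precondition(shape.count >= target.count,
                 "Target rank must not exceed the broadcasted rank.")
    // Right-align the target shape by padding it with leading 1s.
    let padded = [Int](repeating: 1, count: shape.count - target.count) + target
    var axes: [Int] = []
    for (axis, (current, wanted)) in zip(shape, padded).enumerated() {
        // Matching dimensions were not broadcast, so nothing to reduce.
        if current == wanted { continue }
        // A mismatch is only valid if the target dimension is 1.
        precondition(wanted == 1,
                     "Shape \(shape) cannot be unbroadcast to \(target).")
        axes.append(axis)
    }
    return axes
}

// Target [3, 1] is padded to [1, 3, 1]; axes 0 and 2 differ from [2, 3, 4].
let axes = reductionAxes(unbroadcasting: [2, 3, 4], to: [3, 1])
print(axes) // [0, 2]
```

Under these assumptions, the tensor would then be summed along the returned axes and reshaped to the target shape, so only the reduction itself needs to be dispatched to TensorFlow rather than the shape arithmetic.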

With the new implementation, differentiating broadcasting operators is 37% faster (see the simple test script at https://gist.github.com/rxwei/e1488cac5379ba2bc3aff7490e18158f).

Note:

- Since we now rely less on the TensorFlow runtime, more precondition checks and assertions have been added to the newly implemented unbroadcast(to:) method.
- The part of #24408 ([TF] Remove unbroadcast(to:) and improve derivative performance.) that uses Raw.broadcastGradientArgs(s0:s1:) is still necessary to make broadcasting binary operations faster.

TODO:

- Change the unbroadcast(toShape:) tests added by #24899 ([AutoDiff] Add more Tensor broadcast/unbroadcast differentiation tests.) to use unbroadcast(to:), since unbroadcast(to:) is now the base implementation.

rxwei · 2019-05-20T01:20:47Z

@swift-ci please test tensorflow

dan-zheng

Big 👍 to empirical benchmarking!

rxwei added the tensorflow label (This is for "tensorflow" branch PRs.) May 20, 2019

rxwei requested review from bartchr808 and dan-zheng May 20, 2019 01:14

rxwei changed the title ~~[TF] Reimplement unbroadcast using on-host axis calculation.~~ [TF] Reimplement unbroadcast using on-host axis calculation for performance. May 20, 2019

rxwei force-pushed the efficient-unbroadcast branch from 356e4d4 to 4513fa4 Compare May 20, 2019 01:16

dan-zheng approved these changes May 20, 2019


rxwei merged commit 528fb67 into swiftlang:tensorflow May 20, 2019

rxwei deleted the efficient-unbroadcast branch May 20, 2019 02:48

rxwei merged 1 commit into swiftlang:tensorflow from rxwei:efficient-unbroadcast
